In classical parametric hypothesis testing problems, the asymptotic
null distribution of the log-likelihood ratio statistic is chi-square
and does not depend on redundant parameters, a result due to
Wilks (1938). This phenomenon is not confined to fixed-dimension
problems. Several authors have derived similar results for models in
which the sample size and the number of parameters go to infinity
simultaneously. In this paper, we establish Wilks-type theorems for
simple random graph models, known as the $\beta$-model in the
undirected case and the Bradley-Terry model in the directed case, when
the number of parameters goes to infinity and the number of statistical
experiments for each edge is a fixed constant. Numerical studies and a
data application are carried out to demonstrate the theoretical results.



1. Introduction. In classical parametric hypothesis testing problems,
in which the number of parameters is fixed and the sample size goes to
infinity, one of the best-known results is that the asymptotic null
distribution of minus twice the log-likelihood ratio statistic is
chi-square and does not depend on redundant parameters. This result is
due to Wilks (1938) and was called the "Wilks phenomenon" by Fan, Zhang
and Zhang (2001). The phenomenon is not confined to fixed-dimension
problems. Several authors have derived similar results for problems of
increasing dimension, in the sense that the asymptotic null distribution
of the likelihood ratio statistic $\Lambda$ is approximately chi-square
with a large number of degrees of freedom $p_n$, i.e.,

$$\frac{-2\log\Lambda - p_n}{\sqrt{2p_n}} \xrightarrow{d} N(0,1), \quad \text{as } p_n \to \infty.$$

For instance, in a regular exponential family with a sequence of iid
samples $X_1, \ldots, X_n$ and an increasing dimension $p_n$, Portnoy (1988)
derived a Wilks-type result under a simple null when $p_n/n \to 0$. More
generally, in a wide class of nonparametric problems, Fan, Zhang and Zhang
(2001) proved Wilks-type results for generalized likelihood ratio tests. In
this paper, we establish Wilks-type theorems for the simple random graph
models known as the $\beta$-model in the undirected case (this name
was coined by Chatterjee, Diaconis and Sly (2011)) and the Bradley-Terry
model (Bradley and Terry (1952)) in the directed case, when the number
of vertices goes to infinity and the number of statistical experiments
for each edge is a fixed constant.
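As a quick numerical illustration of the normal approximation displayed above, the following sketch draws chi-square variables with a large number of degrees of freedom and standardizes them. It relies only on the textbook facts that a chi-square variable with $p$ degrees of freedom has mean $p$ and variance $2p$; the function name, the values of `p` and `reps`, and the seed are illustrative choices and are not taken from the paper.

```python
import math
import random

def standardized_chi_square(p, rng):
    """Draw one chi-square(p) variate and standardize it as (X - p) / sqrt(2p)."""
    x = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p))
    return (x - p) / math.sqrt(2.0 * p)

rng = random.Random(0)   # illustrative seed
p, reps = 200, 2000      # illustrative sizes

draws = [standardized_chi_square(p, rng) for _ in range(reps)]
mean = sum(draws) / reps
var = sum((z - mean) ** 2 for z in draws) / (reps - 1)

# The sample mean and variance should be close to 0 and 1, respectively.
print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

The standardized draws are still slightly right-skewed for moderate `p`, which is why the approximation is stated for $p_n \to \infty$.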

One of the earliest and simplest random graph models is due to Erdős and
Rényi (1959), who include each edge among a set of vertices independently
with equal probability. This model has been studied widely and in detail;
see the monograph of Bollobás (2001). The degree distributions of the
Erdős-Rényi model are approximately Poisson when the number of vertices
goes to infinity. Thus, when a random graph presents strong clustering
(i.e., the probability of two vertices being connected by an edge is higher
than that of another two vertices) or has a power-law degree distribution
(i.e., some vertices have very large degrees), the Erdős-Rényi model fails
to capture these characteristics. To fill these gaps, exponential random
graph models have been introduced, e.g., the $p_1$ distributions (Holland
and Leinhardt (1981)) and the more general $p^*$ distributions (Robins et
al. (2007)). Here, we focus on two simple but frequently used exponential
models for random graphs without mutual edges and self-loops, which are
known as the $\beta$-model in the undirected case and the Bradley-Terry
model in the directed case, although there is certainly hope for future
progress on more general models. Moreover, we adopt a general sampling
scheme. Consider the undirected edge between vertices $i$ and $j$ in the
undirected case and the directed edge from $i$ to $j$ in the directed case.
Assume that the count $d_{ij}$ of this edge comes from $n_{ij}$ Bernoulli
statistical experiments with mutually independent outcomes. Therefore,
$d_{ij} \sim \mathrm{Bin}(n_{ij}, p_{ij})$, where $p_{ij}$ is the occurrence
probability of the edge, with $d_{ij} = d_{ji}$ in the undirected case and
$d_{ij} + d_{ji} = n_{ij}$ in the directed case. Define
$d_i = \sum_{j \neq i} d_{ij}$, which is the degree of $i$ in the undirected
case and the out-degree of $i$ in the directed case. In the present paper,
we assume $n_{ij} = N$ for all $i \neq j$, where $N$ is a fixed positive
constant. This assumption was considered by Simons and Yao (1999) and
Rinaldo, Petrovic and Fienberg (2011). Detailed descriptions of the
$\beta$-model and the Bradley-Terry model are given below.

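$\beta$-model. The probability that vertex $i$ connects to vertex $j$ is

(1.1)  $$p_{ij} = \frac{e^{\beta_i + \beta_j}}{1 + e^{\beta_i + \beta_j}}, \quad i, j = 1, \ldots, t;\ i \neq j,$$

where $\beta_i$ is the influence parameter of vertex $i$. Up to a constant
factor that does not depend on $\beta$, the likelihood function
corresponding to the $\beta$-model is

(1.2)  $$L_\beta(\beta) = \prod_{1 \le i < j \le t} p_{ij}^{d_{ij}} (1 - p_{ij})^{N - d_{ij}} = \frac{\exp\bigl(\sum_{i=1}^t \beta_i d_i\bigr)}{\prod_{1 \le i < j \le t} (1 + e^{\beta_i + \beta_j})^N},$$

and the likelihood equations are

(1.3)  $$d_i = \sum_{j \neq i} \frac{N e^{\hat\beta_i + \hat\beta_j}}{1 + e^{\hat\beta_i + \hat\beta_j}}, \quad i = 1, \ldots, t,$$

where $\hat\beta_i$, $i = 1, \ldots, t$, are the MLEs of $\beta_i$, $i = 1, \ldots, t$.

Bradley-Terry model. The probability of vertex $i$ pointing to $j$ is

(1.4)  $$p_{ij} = \frac{e^{\beta_i}}{e^{\beta_i} + e^{\beta_j}}, \quad i, j = 1, \ldots, t;\ i \neq j.$$

Up to a constant factor that does not depend on $\beta$, the likelihood
function is

(1.5)  $$L_{bt}(\beta) = \prod_{1 \le i < j \le t} p_{ij}^{d_{ij}} p_{ji}^{d_{ji}} = \frac{\exp\bigl(\sum_{i=1}^t \beta_i d_i\bigr)}{\prod_{1 \le i < j \le t} (e^{\beta_i} + e^{\beta_j})^N}.$$

Since the likelihood (1.5) can be represented as a function of the $t - 1$
differences $\beta_i - \beta_1$, $i = 2, \ldots, t$, for model identification we
set $\beta_1 = 0$ as a restriction. Another way to this end is to impose a
normalization such as $\sum_i e^{\beta_i} = 1$. The corresponding
likelihood equations are therefore

(1.6)  $$d_i = \sum_{j \neq i} \frac{N e^{\hat\beta_i}}{e^{\hat\beta_i} + e^{\hat\beta_j}}, \quad i = 2, \ldots, t,$$

where $\hat\beta_1 = 0$.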
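To make the Bradley-Terry likelihood equations concrete, the sketch below simulates win counts under the sampling scheme described above and solves the equations numerically. The solver is a Zermelo-type fixed-point iteration on $u_i = e^{\beta_i}$; this choice of algorithm is an assumption made for illustration and is not described in the paper. The function names, the seed and the sizes `t = 8`, `N = 5` are likewise hypothetical.

```python
import math
import random

def simulate_bt(beta, N, rng):
    """Simulate win counts: d[i][j] ~ Bin(N, p_ij) and d[j][i] = N - d[i][j]."""
    t = len(beta)
    d = [[0] * t for _ in range(t)]
    for i in range(t):
        for j in range(i + 1, t):
            p_ij = math.exp(beta[i]) / (math.exp(beta[i]) + math.exp(beta[j]))
            wins = sum(rng.random() < p_ij for _ in range(N))
            d[i][j], d[j][i] = wins, N - wins
    return d

def fit_bt(d, N, max_iter=5000, tol=1e-13):
    """Zermelo-type iteration on u_i = exp(beta_i), normalized so that beta_1 = 0."""
    t = len(d)
    deg = [sum(row) for row in d]  # d_i: total wins of subject i
    # Necessary (not sufficient) check for existence of the MLE.
    if min(deg) == 0 or max(deg) == N * (t - 1):
        raise ValueError("a subject won or lost everything: the MLE does not exist")
    u = [1.0] * t
    for _ in range(max_iter):
        new = [deg[i] / sum(N / (u[i] + u[j]) for j in range(t) if j != i)
               for i in range(t)]
        new = [x / new[0] for x in new]  # identification: u_1 = 1
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            break
    return [math.log(x) for x in u]

def score_residuals(beta_hat, d, N):
    """d_i minus the fitted expectation sum_{j != i} N p_ij, for every subject i."""
    t = len(d)
    e = [math.exp(b) for b in beta_hat]
    return [sum(d[i]) - sum(N * e[i] / (e[i] + e[j]) for j in range(t) if j != i)
            for i in range(t)]

rng = random.Random(3)                    # illustrative seed
t, N = 8, 5                               # illustrative sizes
beta_true = [0.1 * k for k in range(t)]   # first parameter fixed at 0
d = simulate_bt(beta_true, N, rng)
beta_hat = fit_bt(d, N)
max_residual = max(abs(r) for r in score_residuals(beta_hat, d, N))
print("largest score residual:", max_residual)
```

At a fixed point of the iteration, each observed win total $d_i$ equals its fitted expectation, which is the likelihood equation with the first parameter set to zero.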

The $\beta$-model is widely used for analyzing graph and network data (see,
e.g., Blitzstein and Diaconis (2009); Park and Newman (2004); Jackson
(2008)). The closely related Bradley-Terry model (Bradley and Terry (1952)),
which was independently proposed by Zermelo (1929) and Ford (1957), is
widely applied to rank subjects in paired comparisons (see, e.g., the book
by David (1988)). When $t$ is fixed and all $n_{ij}$ go to infinity, the
consistency and asymptotic normality of the MLE in the $\beta$-model and the
Bradley-Terry model are standard, as are Wilks-type theorems. In the
reverse scenario, in which all $n_{ij}$ are fixed and $t$ goes to infinity,
Simons and Yao (1999) established the consistency and asymptotic normality
of the MLE for the Bradley-Terry model; Chatterjee, Diaconis and Sly (2011)
proved the consistency of $\hat\beta$ for the $\beta$-model. Yan and Xu
(2012) further proved the asymptotic normality of $\hat\beta$ in the
$\beta$-model. These results contrast with the well-known Neyman-Scott
problem (Neyman and Scott (1948)), where the maximum likelihood estimate of
the variance is inconsistent when the number of parameters goes to
infinity. Since the MLEs in these two models retain good asymptotic
properties in this non-classical setting, several questions arise
naturally. How do the likelihood ratio tests perform when $t$ goes to
infinity? Is there a similar Wilks phenomenon? These questions motivate
the present paper.
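For contrast, the Neyman-Scott inconsistency mentioned above is easy to reproduce numerically. The sketch below assumes the textbook version of the problem, which is not spelled out in the paper: independent pairs $(X_{i1}, X_{i2})$ drawn from $N(\mu_i, \sigma^2)$ with a separate nuisance mean $\mu_i$ for every pair. In that setup the MLE of $\sigma^2$ converges to $\sigma^2/2$ rather than $\sigma^2$. The function name, the range of the nuisance means and the seed are illustrative.

```python
import random

def neyman_scott_mle_variance(n_pairs, sigma, rng):
    """MLE of sigma^2 when each pair (X_i1, X_i2) has its own unknown mean mu_i."""
    total = 0.0
    for _ in range(n_pairs):
        mu = rng.uniform(-5.0, 5.0)   # arbitrary nuisance mean (illustrative)
        x1 = rng.gauss(mu, sigma)
        x2 = rng.gauss(mu, sigma)
        xbar = 0.5 * (x1 + x2)        # MLE of mu_i
        total += (x1 - xbar) ** 2 + (x2 - xbar) ** 2
    return total / (2 * n_pairs)      # MLE of sigma^2

rng = random.Random(1)                # illustrative seed
estimate = neyman_scott_mle_variance(20000, 1.0, rng)
print(f"MLE of sigma^2 = {estimate:.3f} (true value 1.0)")
```

With 20000 pairs the estimate settles near 0.5, half the true variance, however many pairs are added.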

The remainder of this paper is organized as follows. Wilks-type theorems
are presented in Section 2. Numerical studies and a data example are
given in Section 3. Some discussion is given in Section 4. The proofs of
the theorems are in Section 5. The supporting lemmas are proved in
Section 6.



2. Main results. Define

(2.1)  $$L_t = \max_{i} |\beta_i|, \qquad M_t = \max_{i,j} e^{\beta_i - \beta_j}, \qquad v_{ij} = \mathrm{Var}(d_{ij}) = N p_{ij}(1 - p_{ij}), \quad i \neq j.$$

As discussed in Yan and Xu (2012) and Simons and Yao (1999), in order
to guarantee the existence of the MLEs for equations (1.3) and (1.6), it is
necessary to control the growth rate of $L_t$. For convenience, we use $M_t$
instead of $L_t$ to measure the growth rate in the Bradley-Terry model,
noting that $e^{-2L_t} \le M_t \le e^{2L_t}$. The Wilks-type theorem under a
simple null takes the form:

Theorem 1. (1) For the $\beta$-model, if $L_t = o(\log(\log t))$ and
Y l ■ , • ,-|2eft+ft - II Y f . , .,. e Pi+Pi 

2^=i;^ iff l = o(i) 2 ^ =1 ^j 6 = m 

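then the log-likelihood ratio statistic $\log\Lambda_\beta = \log L_\beta(\hat\beta) - \log L_\beta(\beta)$
is asymptotically normally distributed in the sense that

$$\frac{2\log\Lambda_\beta - t}{\sqrt{2t}} \xrightarrow{d} N(0,1).$$

(2) For the Bradley-Terry model, if

$$M_t = o\!\left(\frac{t^{1/14}}{(\log t)^{2/7}}\right) \quad \text{and} \quad \sum_{i,j=1}^{t} \left|\frac{e^{\beta_i} - e^{\beta_j}}{e^{\beta_i} + e^{\beta_j}}\right| = o\!\left(\frac{t^{25/14}}{(\log t)^{15/7}}\right),$$

then the log-likelihood ratio statistic $\log\Lambda_{bt} = \log L_{bt}(\hat\beta) - \log L_{bt}(\beta)$
is asymptotically normally distributed in the sense that

$$\frac{2\log\Lambda_{bt} - (t-1)}{\sqrt{2(t-1)}} \xrightarrow{d} N(0,1).$$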
Let $V_t$ denote the covariance matrix of $d_1, \ldots, d_t$ and $V_{t-1}$
denote the covariance matrix of $d_2, \ldots, d_t$. Their diagonal entries are

$$v_{ii} = \sum_{k=1, k \neq i}^{t} N p_{ik}(1 - p_{ik}), \quad i = 1, \ldots, t,$$

and for $1 \le i \neq j \le t$, the $(i,j)$ entry is

$$\mathrm{Cov}(d_i, d_j) = \begin{cases} v_{ij} & \text{for the } \beta\text{-model}, \\ -v_{ij} & \text{for the Bradley-Terry model}. \end{cases}$$

Note that $V_t$ is also the Fisher information matrix of $\beta$. Let

$$a_i = \sum_{j=1}^{i-1} d_{ij}, \qquad b_i = d_i - a_i = \sum_{j=i+1}^{t} d_{ij}.$$

Note that $a_i$ is independent of $b_i$. Theorem 1 follows from the lemmas
below, whose proofs are relegated to Section 6. For convenience, define
$m = \min_{i,j} p_{ij}(1 - p_{ij})$ and $M = \max_{i,j} p_{ij}(1 - p_{ij}) \le 1/4$.
In the $\beta$-model, $m \ge e^{2L_t}/(1 + e^{2L_t})^2$; in the Bradley-Terry
model, $m \ge M_t/(1 + M_t)^2$.

Lemma 1. If $M/m = o(t^{1/6})$, then the following hold:

(1) $\sum_{i=1}^t (a_i - E(a_i))^2 / v_{ii}$ is asymptotically normally distributed
with mean $\sum_{i=1}^t \sum_{j=1}^{i-1} v_{ij}/v_{ii}$ and variance
$\sum_{i=1}^t \mathrm{Var}[(a_i - E(a_i))^2 / v_{ii}]$.

(2) $\sum_{i=1}^t (b_i - E(b_i))^2 / v_{ii}$ is asymptotically normally distributed
with mean $\sum_{i=1}^t \sum_{j=i+1}^{t} v_{ij}/v_{ii}$ and variance
$\sum_{i=1}^t \mathrm{Var}[(b_i - E(b_i))^2 / v_{ii}]$.

(3) $\sum_{i=1}^t (a_i - E(a_i))(b_i - E(b_i)) / v_{ii}$ is asymptotically normally
distributed with mean $0$ and variance
$\sum_{i=1}^t \sum_{j=1}^{i-1} \sum_{k=i+1}^{t} v_{ij} v_{ik} / v_{ii}^2$.

(4) $\sum_{i=1}^t (d_i - E(d_i))^2 / v_{ii}$ is asymptotically normally distributed
with mean $t$ and variance $2t$.
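Part (4) can be checked by simulation. The sketch below draws the statistic $\sum_i (d_i - E(d_i))^2/v_{ii}$ repeatedly under the $\beta$-model and compares its Monte Carlo mean and variance with $t$ and $2t$. The parameter values, seed and function name are illustrative assumptions; the mean equals $t$ exactly by construction, while the variance is only approximately $2t$ for finite $t$.

```python
import math
import random

def degree_statistic(beta, N, rng):
    """One draw of sum_i (d_i - E d_i)^2 / v_ii under the beta-model."""
    t = len(beta)
    dev = [0.0] * t   # running value of d_i - E(d_i)
    v = [0.0] * t     # running value of v_ii = sum_{k != i} N p_ik (1 - p_ik)
    for i in range(t):
        for j in range(i + 1, t):
            p = 1.0 / (1.0 + math.exp(-(beta[i] + beta[j])))
            d_ij = sum(rng.random() < p for _ in range(N))
            for k in (i, j):  # the undirected edge contributes to both endpoints
                dev[k] += d_ij - N * p
                v[k] += N * p * (1.0 - p)
    return sum(dev[i] ** 2 / v[i] for i in range(t))

rng = random.Random(5)                          # illustrative seed
t, N, reps = 30, 1, 500                         # illustrative sizes
beta = [0.5 * k / t for k in range(1, t + 1)]
draws = [degree_statistic(beta, N, rng) for _ in range(reps)]

mean = sum(draws) / reps
var = sum((x - mean) ** 2 for x in draws) / (reps - 1)
print(f"mean = {mean:.2f} (t = {t}), variance = {var:.2f} (2t = {2 * t})")
```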

Lemma 2. If $M/m = o(t^{1/6})$, then the following hold:

(1) For the $\beta$-model, the test statistic $(d - E(d))^T V_t^{-1} (d - E(d))$ is
asymptotically normally distributed with mean $t$ and variance $2t$.

(2) For the Bradley-Terry model, the test statistic
$(d_{t-1} - E(d_{t-1}))^T V_{t-1}^{-1} (d_{t-1} - E(d_{t-1}))$ is asymptotically
normally distributed with mean $t - 1$ and variance $2(t - 1)$.
Now, we extend Theorem 1 to a case with redundant parameters, testing
whether a subset of the parameters are equal. Without loss of generality, we
assume that the null takes the form $H_0: \beta_1 = \cdots = \beta_m$. Let
$\hat\beta^{res} = (\hat\beta_1^{res}, \ldots, \hat\beta_t^{res})$ be the maximum
likelihood estimate of $\beta$ under $H_0$.

Theorem 2. Assume that $m/t \ge \tau$, where $\tau \in (0,1]$ is a positive
constant.

(1) For the $\beta$-model, if $L_t = o(\log(\log t))$ and

tyiogt {h ts/2/ bg t 

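then the log-likelihood ratio statistic $\log\Lambda_\beta = \log L_\beta(\hat\beta) - \log L_\beta(\hat\beta^{res})$
is asymptotically normally distributed in the sense that

(2.2)  $$\frac{2\log\Lambda_\beta - m}{\sqrt{2m}} \xrightarrow{d} N(0,1).$$

(2) For the Bradley-Terry model, if

$$M_t = o\!\left(\frac{(t-m)^{1/14}}{(\log t)^{2/7}}\right) \quad \text{and} \quad \sum_{i,j=m+1}^{t} \left|\frac{e^{\beta_i} - e^{\beta_j}}{e^{\beta_i} + e^{\beta_j}}\right| = o\!\left(\frac{(t-m)^{25/14}}{(\log(t-m))^{15/7}}\right),$$

then the log-likelihood ratio statistic $\log\Lambda_{bt} = \log L_{bt}(\hat\beta) - \log L_{bt}(\hat\beta^{res})$
is asymptotically normally distributed in the sense that

(2.3)  $$\frac{2\log\Lambda_{bt} - (m-1)}{\sqrt{2(m-1)}} \xrightarrow{d} N(0,1).$$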

Remark 1. Although the Wilks-type theorems for the $\beta$-model and
the Bradley-Terry model are established under the condition that all
$n_{ij} = N$, $i \neq j$, the results can easily be extended to the
situation where $1 \le n_{ij} \le N$.

3. Simulation. We carry out simulations to evaluate Theorems 1 and
2. The parameters were set to $N = 1$, $t = 30$ and $\beta_k = kL_t/t$,
$k = 1, \ldots, t$. If the MLE does not exist, we define the LRT to be
infinity. The simulated and asymptotic distributions of the LRTs are drawn
in Figure 1. From these figures, we can see that the simulated distributions
of the LRTs are very close to the standard normal distribution in all
simulated situations except when $L_t = \log t$ for the $\beta$-model, in
which case the MLE fails to exist with probability close to 1. These
simulation results indicate that the conditions on $L_t$ or $M_t$ in
Theorems 1 and 2 may be relaxed greatly.
Fig 1. The simulated distributions of the LRTs. The solid and dashed lines
represent the simulated and the normal distributions, respectively. The first
and second rows are figures for the $\beta$-model corresponding to the
parameters $m = 1$ and $m = t/2$, respectively. The third and fourth rows are
figures for the Bradley-Terry model corresponding to the parameters $m = 1$
and $m = t/2$, respectively. The three panels in each row correspond to three
different values of $L_t$.




Next, we simulated the powers of the LRTs. The null takes the form
$H_0: \beta_1 = \cdots = \beta_m = 0$ and the alternative is assumed to be
$H_1: \beta_i = ic/m$, $i = 1, \ldots, m$. The remaining redundant parameters
are set to $\beta_i = (i - m)L_t/t$ for $i = m + 1, \ldots, t$. The simulated
results are reported in Table 1. From Table 1, we can see that the powers
become larger as $m$ increases when $t$ is fixed, and also become larger as
$t$ increases when $m$ is fixed. When $t = 30$, the simulated type I errors
look a bit higher than the nominal level; when $t = 50$, the LRTs control the
type I errors well. Moreover, the powers under $L_t = 0$ are a little higher
than those under $L_t = \log(\log t)$ for the $\beta$-model but a bit lower
than those under $L_t = \log(\log t)$ for the Bradley-Terry model when
$t$, $m$, $c$ are fixed. This shows that the redundant parameters have some
influence on the powers. On the other hand, the powers for the $\beta$-model
are higher than those for the Bradley-Terry model under the same parameters.
For instance, when $c = 0.8$, the powers for the $\beta$-model exceed 80%
while those for the Bradley-Terry model do not exceed 53%.
Table 1
Powers of the LRTs

Powers for the $\beta$-model

  t     L_t            m     c=0      c=0.2    c=0.4    c=0.6    c=0.8
  30    0              10    0.063    0.098    0.288    0.595    0.866
  30    0              20    0.062    0.127    0.438    0.866    0.990
  30    log(log t)     10    0.059    0.100    0.250    0.541    0.823
  30    log(log t)     20    0.061    0.120    0.431    0.854    0.989
  50    0              10    0.049    0.129    0.431    0.833    0.984
  50    0              20    0.052    0.1542   0.634    0.976    1.000
  50    log(log t)     10    0.048    0.109    0.366    0.752    0.961
  50    log(log t)     20    0.053    0.148    0.601    0.961    0.999

Powers for the Bradley-Terry model

  t     L_t            m     c=0      c=0.4    c=0.8    c=1.2    c=1.6
  30    0              10    0.058    0.089    0.221    0.497    0.751
  30    0              20    0.054    0.106    0.323    0.701    0.940
  30    log(log t)     10    0.056    0.094    0.240    0.523    0.790
  30    log(log t)     20    0.052    0.107    0.332    0.695    0.942
  50    0              10    0.051    0.110    0.375    0.736    0.939
  50    0              20    0.051    0.137    0.528    0.916    0.996
  50    log(log t)     10    0.053    0.099    0.366    0.7480   0.965
  50    log(log t)     20    0.060    0.145    0.529    0.9344   0.998


3.2 2008-09 NBA season 

The National Basketball Association (NBA) is one of the most successful
basketball leagues in the world. There are 30 teams in total in the NBA,
which are organized into two conferences: the Western Conference and the
Eastern Conference. Each conference is composed of three divisions and each
division has five teams. In the regular season, every team plays every other
team two to four times and each team plays 82 games in total. We use
the 2008-09 NBA season data as an illustrative example.

The fitted merits in the Bradley-Terry model are given in Table 2, in
which the Philadelphia 76ers serve as the reference team. From this table,
we can see that the rankings based on the win-loss percentage and on the
merits are similar. We use the test statistics to test whether there are
significant differences among the teams ranked No. 3 to No. 10 in the
Eastern Conference and among the teams ranked No. 2 to No. 9 in the Western
Conference. The values of the LRTs are 3.944 and -0.750 for the Eastern
Conference and the Western Conference, with p-values $8.002 \times 10^{-5}$
and 0.453, respectively. This indicates that there are significant
differences among those eight teams in the Eastern Conference and no
significant differences among those eight teams in the Western Conference.
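The reported p-values are consistent with a two-sided test against the standard normal limit. The sketch below recomputes them from the rounded statistic values quoted above; treating the test as two-sided is an inference from the reported numbers rather than something stated explicitly in the text, and the function name is illustrative.

```python
import math

def two_sided_normal_p_value(z):
    """Two-sided p-value 2 * (1 - Phi(|z|)) under the standard normal law."""
    return math.erfc(abs(z) / math.sqrt(2.0))

for label, z in [("Eastern Conference", 3.944), ("Western Conference", -0.750)]:
    print(f"{label}: z = {z:+.3f}, p = {two_sided_normal_p_value(z):.3e}")
```

Small discrepancies in the last digit are expected because the statistic values are rounded to three decimals.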
Table 2
Merits of 2008-09 NBA season

        Eastern Conference                     Western Conference
  No.   Team                  W-L     Merit    Team                     W-L     Merit
  1     Cleveland Cavaliers   66-16   4.532    Los Angeles Lakers       65-17   4.158
  2     Boston Celtics        62-20   3.462    Denver Nuggets           54-28   2.058
  3     Orlando Magic         59-23   2.745    San Antonio Spurs        54-28   2.005
  4     Atlanta Hawks         47-35   1.404    Portland Trail Blazers   54-28   2.059
  5     Miami Heat            43-39   1.146    Houston Rockets          53-29   1.953
  6     Philadelphia 76ers    41-41   1.000    Dallas Mavericks         50-32   1.612
  7     Chicago Bulls         41-41   1.002    New Orleans Hornets      49-33   1.563
  8     Detroit Pistons       39-43   0.899    Utah Jazz                48-34   1.425
  9     Indiana Pacers        36-46   0.794    Phoenix Suns             46-36   1.284
  10    Charlotte Bobcats     35-47   0.716    Golden State Warriors    29-53   0.502
  11    New Jersey Nets       34-48   0.682    Minnesota Timberwolves   24-58   0.383
  12    Milwaukee Bucks       34-48   0.697    Memphis Grizzlies        24-58   0.387
  13    Toronto Raptors       33-49   0.659    Oklahoma City Thunder    23-59   0.349
  14    New York Knicks       32-50   0.621    Los Angeles Clippers     19-63   0.272
  15    Washington Wizards    19-63   0.283    Sacramento Kings         17-65   0.230



4. Discussion. We have derived Wilks-type theorems for the $\beta$-model
and the Bradley-Terry model under a simple null when the statistical
experiments are dense, i.e., $n_{ij} = N$ for all pairs of vertices, as $t$
goes to infinity. Simulations suggest that the normal approximations for the
likelihood ratio tests in Theorems 1 and 2 remain good when
$L_t = o(\log t)$. It is therefore of interest to see whether the conditions
in Theorems 1 and 2 can be relaxed. Moreover, we only consider dense
statistical experiments in this paper. Although dense experiments do occur
in practice, in some applications the statistical experiments may be sparse
(see Yan, Yang and Xu (2011); Yan, Xu and Yang). It is also of interest to
see whether the Wilks-type results continue to hold under sparse statistical
experiments and what sparsity conditions need to be imposed.

5. Proof of Theorems. Before beginning the proofs, we introduce one
lemma and one theorem, which are due to Yan, Xu and Yang and to
Simons and Yao (1999).

Lemma 3. (1) For the $\beta$-model, let $S_t = (s_{ij})_{i,j=1,\ldots,t}$ be

$$s_{ij} = \frac{\delta_{ij}}{v_{ii}} - \frac{1}{v_{\cdot\cdot}},$$

where $v_{\cdot\cdot} = \sum_{i=1}^t \sum_{j \neq i} v_{ij}$ and $\delta_{ij}$ is the
Kronecker delta function. Then we have

$$\|W_t := V_t^{-1} - S_t\| \le \frac{(1 + e^{2L_t})^6}{8 e^{6L_t} (t-1)^2},$$

where $\|A\|$ denotes $\max_{i,j} |a_{ij}|$ for a general matrix $A = (a_{ij})$.

(2) For the Bradley-Terry model, let $S_{t-1} = (s_{ij})_{i,j=2,\ldots,t}$ be

$$s_{ij} = \frac{\delta_{ij}}{v_{ii}} + \frac{1}{v_{11}}.$$

Then we have

$$\|W_{t-1} := V_{t-1}^{-1} - S_{t-1}\| \le \frac{4NM_t^2(1 + NM_t)}{(t-1)^2}.$$
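A worked special case may help to see the size of the approximation error in the Bradley-Terry part. It assumes that the approximating matrix has entries $s_{ij} = \delta_{ij}/v_{ii} + 1/v_{11}$, as in Simons and Yao (1999), and takes all merits equal ($\beta_1 = \cdots = \beta_t = 0$, so that $M_t = 1$). This equal-merit configuration is an illustrative assumption and is not treated separately in the paper.

```latex
% Equal merits: p_{ij} = 1/2, so v_{ij} = N/4 and v_{ii} = N(t-1)/4.
% I_{t-1} is the identity and J_{t-1} the all-ones matrix of order t-1,
% so that J_{t-1}^2 = (t-1) J_{t-1}.
\[
V_{t-1} = \frac{N}{4}\bigl(t I_{t-1} - J_{t-1}\bigr), \qquad
V_{t-1}^{-1} = \frac{4}{Nt}\bigl(I_{t-1} + J_{t-1}\bigr), \qquad
S_{t-1} = \frac{4}{N(t-1)}\bigl(I_{t-1} + J_{t-1}\bigr),
\]
\[
V_{t-1}^{-1} - S_{t-1} = -\frac{4}{N t (t-1)}\bigl(I_{t-1} + J_{t-1}\bigr),
\qquad
\max_{i,j} \bigl| (V_{t-1}^{-1} - S_{t-1})_{ij} \bigr| = \frac{8}{N t (t-1)} .
\]
```

In this case the entries of $V_{t-1}^{-1}$ are of order $1/t$ while the approximation error is of order $1/t^2$, which is the rate $(t-1)^{-2}$ appearing in the lemma.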



Theorem 5.1. (1) For the $\beta$-model, if $L_t = o(\log(\log t))$, then
$\hat\beta$ uniquely exists with probability approaching one and is uniformly
consistent in the sense that



max |ft - fc| < O p I Cl e^ L \ rf ^ ] = 0p (l), 

l<i<n \ V t — 1 

where $c_1$, $c_2$, $c_3$ are constants.

(2) For the Bradley-Terry model, if $M_t = o(\sqrt{t/\log t})$, then $\hat\beta$
uniquely exists with probability approaching one and is uniformly consistent
in the sense that



max |e^- ft - 1| < max \e^ - \ < OMlJ 1 ^ — % = oJl). 

l<i<t l l<i,j<t l V t-1 7 PW 

Proof of Theorem 1. The proofs of Theorem 1 (1) and (2) are similar, so
we only present the proof of Theorem 1 (2). Let $E$ be the event that the
MLE in (1.6) exists and satisfies

(5.1)  $$S_t := \max_{i,j} |(\hat\beta_i - \beta_i) - (\hat\beta_j - \beta_j)| \le O\!\left(8M_t\sqrt{\frac{\log(t-1)}{t-1}}\right).$$

By Theorem 5.1 (2), the event $E$ holds with probability approaching one if
$M_t = o(\sqrt{t/\log t})$. The following calculations are carried out on the
event $E$.

Note that $\hat\beta_1 = \beta_1 = 0$ in the Bradley-Terry model. Let
$\hat\beta_{t-1} = (\hat\beta_2, \ldots, \hat\beta_t)$, $\beta_{t-1} = (\beta_2, \ldots, \beta_t)$ and

$$\ell(\beta_{t-1}) = \log L(\beta_{t-1}) = \sum_{i=1}^{t} \beta_i d_i - N\sum_{1 \le i < j \le t} \log(e^{\beta_i} + e^{\beta_j}).$$

By Taylor expansion, we have

(5.2)  $$\ell(\hat\beta_{t-1}) - \ell(\beta_{t-1}) = (d_{t-1} - E(d_{t-1}))^T(\hat\beta_{t-1} - \beta_{t-1}) - \frac{1}{2}(\hat\beta_{t-1} - \beta_{t-1})^T V_{t-1}(\hat\beta_{t-1} - \beta_{t-1}) + z,$$
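where

$$z = \frac{1}{6}\sum_{i,j,k=2}^{t} \frac{\partial^3 \ell(\beta_{t-1} + \theta(\hat\beta_{t-1} - \beta_{t-1}))}{\partial\beta_i\,\partial\beta_j\,\partial\beta_k}(\hat\beta_i - \beta_i)(\hat\beta_j - \beta_j)(\hat\beta_k - \beta_k), \quad 0 < \theta < 1,$$

and

$$\frac{\partial^3 \ell}{\partial\beta_i^3} = -\sum_{j \neq i} \frac{N e^{\beta_i} e^{\beta_j}(e^{\beta_j} - e^{\beta_i})}{(e^{\beta_i} + e^{\beta_j})^3}, \qquad \frac{\partial^3 \ell}{\partial\beta_i^2\,\partial\beta_j} = \frac{N e^{\beta_i} e^{\beta_j}(e^{\beta_j} - e^{\beta_i})}{(e^{\beta_i} + e^{\beta_j})^3}.$$

Similarly, we have

$$\frac{e^{\hat\beta_i - \hat\beta_j}}{1 + e^{\hat\beta_i - \hat\beta_j}} = \frac{e^{\beta_i - \beta_j}}{1 + e^{\beta_i - \beta_j}} + \frac{e^{\beta_i - \beta_j}}{(1 + e^{\beta_i - \beta_j})^2}\gamma_{ij} + \frac{e^{\theta_{ij}}(1 - e^{\theta_{ij}})}{2(1 + e^{\theta_{ij}})^3}\gamma_{ij}^2,$$

where $\gamma_{ij} = \hat\beta_i - \hat\beta_j - (\beta_i - \beta_j)$ and
$\theta_{ij} = \beta_i - \beta_j + \theta'_{ij}\gamma_{ij}$, $0 < \theta'_{ij} < 1$.
Let $h_{ij} = N e^{\theta_{ij}}(1 - e^{\theta_{ij}})\gamma_{ij}^2 / (2(1 + e^{\theta_{ij}})^3)$
and $h_i = \sum_{j \neq i} h_{ij}$. Then we have

(5.3)  $$d_i - E(d_i) = \sum_{j=1, j \neq i}^{t} v_{ij}[(\hat\beta_i - \beta_i) - (\hat\beta_j - \beta_j)] + h_i, \quad i = 1, \ldots, t,$$

such that

$$d_{t-1} - E(d_{t-1}) = V_{t-1}(\hat\beta_{t-1} - \beta_{t-1}) + h_{t-1},$$

where $h_{t-1} = (h_2, \ldots, h_t)^T$. Substituting
$\hat\beta_{t-1} - \beta_{t-1} = V_{t-1}^{-1}[(d_{t-1} - E(d_{t-1})) - h_{t-1}]$
into (5.2) yields

$$\ell(\hat\beta_{t-1}) - \ell(\beta_{t-1}) = \frac{1}{2}(d_{t-1} - E(d_{t-1}))^T V_{t-1}^{-1}(d_{t-1} - E(d_{t-1})) - \frac{1}{2}h_{t-1}^T V_{t-1}^{-1} h_{t-1} + z.$$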

In view of Lemma 2, in order to prove Theorem 1 (2) we only need to show
that

(5.4)  $$\frac{h_{t-1}^T V_{t-1}^{-1} h_{t-1}}{\sqrt{t}} = o_p(1) \quad \text{and} \quad \frac{z}{\sqrt{t}} = o_p(1).$$

Note that $|e^x(1 - e^x)/(1 + e^x)^3| \le e^x/(1 + e^x)^2 \le 1/4$. According to
the definition of $h_{ij}$, we have

(5.5)  $$|h_{ij}| \le N\gamma_{ij}^2/4 \quad \text{and} \quad |h_i| \le \sum_{j \neq i} |h_{ij}| \le N(t-1)S_t^2/4.$$
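The elementary bound $|e^x(1 - e^x)/(1 + e^x)^3| \le 1/4$ used to obtain (5.5) can be verified in two steps. The shorthand $p = e^x/(1 + e^x)$ is introduced only for this remark.

```latex
\[
\left| \frac{e^{x}(1 - e^{x})}{(1 + e^{x})^{3}} \right|
= \frac{e^{x}}{(1 + e^{x})^{2}} \cdot \frac{|1 - e^{x}|}{1 + e^{x}}
\le \frac{e^{x}}{(1 + e^{x})^{2}}
= p(1 - p) \le \frac{1}{4},
\qquad p = \frac{e^{x}}{1 + e^{x}} .
\]
```

The first inequality uses $|1 - e^x| \le 1 + e^x$, and the second is the usual bound on the variance of a Bernoulli variable.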
Since

$$\sum_{i=2}^{t}(d_i - E(d_i)) = \sum_{i=2}^{t}\sum_{j=1, j \neq i}^{t} v_{ij}[(\hat\beta_i - \beta_i) - (\hat\beta_j - \beta_j)] + \sum_{i=2}^{t} h_i = \sum_{j=2}^{t} v_{1j}(\hat\beta_j - \beta_j) + \sum_{i=2}^{t} h_i$$

and $\sum_{i=1}^{t}(d_i - E(d_i)) = 0$, we have

$$\Bigl|\sum_{i=2}^{t} h_i\Bigr| = \Bigl|-(d_1 - E(d_1)) - \sum_{j=2}^{t} v_{1j}(\hat\beta_j - \beta_j)\Bigr| \le |d_1 - E(d_1)| + v_{11}S_t.$$

It is easily checked that if $M_t = o(t)$, then $(d_1 - E(d_1))^2/v_{11} = O_p(1)$,
by noting that $d_1 = \sum_{j=2}^{t} d_{1j}$ is a sum of independent binomial
random variables. Consequently, by (5.5), we have

hfrli'S't-iht-i 

^ "11 



16* ( * - 1} 6t X JVM t (t-l) + 2 + 2VUOt 



2 



< 0(M|(logt) 2 ). 
Therefore, by Lemma 3 (2) and the inequality (5.5), we have 

< 0{Mf{\ogtf) + ANMf{l + NM t ) max \h^ 

(5.6) < O p (M t 4 (logt) 2 ) + O p {Mj log t). 

It is easily checked that 

d 2 £((3 + 90 -(3)) JV , eft+^ft - e ft+9 ^ , 



^ JV ..eP'-eP*. 2S t 
( 5 - 7 ) < -T X ( L a, , ^. 1 + 



4 + eft 1 1-35/ 

According the definition of z, we have 

{\% 4 4 J Vl e ft+ e ft' i_ 35 / 

M t 3 (logt) 3 / 2 Ei 
(5.8) = O p \M?{\ogtY + 



t 3/2 
By (5.6) and (5.8), if $M_t = o(t^{1/14}/(\log t)^{2/7})$ and
$\sum_{i,j=1}^{t} |(e^{\beta_i} - e^{\beta_j})/(e^{\beta_i} + e^{\beta_j})| = o(t^{25/14}/(\log t)^{15/7})$,
then we get (5.4). This completes the proof. □

Under the null $H_0$, we use the matrix $V_{22} = (v_{ij})_{i,j=m+1,\ldots,t}$ to
denote the covariance matrix of $d_{m+1}, \ldots, d_t$. Similar to Lemma 3 and
Theorem 5.1, under the null $H_0$, we also have

Lemma 4. (1) For the $\beta$-model, let $S_{22} = (s_{ij})_{i,j=m+1,\ldots,t}$ be

(5.9)  $$s_{ij} = \frac{\delta_{ij}}{v_{ii}} - \frac{1}{v_{\cdot\cdot}},$$

where $v_{\cdot\cdot} = \sum_{i,j=m+1;\, j \neq i}^{t} v_{ij}$. Then we have

(-> , 2L t \6 

ll^'-gdl<o( 8 Ut (t _{,, )■ 

(2)For the Bradley-Terry model, let S22 = ( s ij)ij'=m+l — t ^ e 

(5.10) % = ^ + ^, 
where v\\ = J2i,j= m +i v ij ■ Then we have 

-1 5 r*f mM ?( l + 



|^22-5 2 2|| <0(- 



£ 2 



Theorem 5.2. (1) For the $\beta$-model, if $L_t = o(\log(\log t))$, then
$\hat\beta^{res}$ is uniformly consistent in the sense that



max |/3r - < O^ie^ xM^) = Op(l), 



where $c_1$, $c_2$, $c_3$ are positive constants.

(2) For the Bradley-Terry model, if $M_t = o((t - m)/\log(t - m))$, then



. ares ores /Ores /Qres . / lOSH £ — ?7l ) N , N 

max - ^ < OJ8M t \ = o p 1). 

i<i,j<t y t — m 

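Proof of Theorem 2. Let $d^1 = (d_2, \ldots, d_m)$, $d^2 = (d_{m+1}, \ldots, d_t)$ and

$$V_{t-1} := \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix},$$

where $V_{11}$ and $V_{22}$ have dimensions $(m-1) \times (m-1)$ and
$(t-m) \times (t-m)$, respectively. Note that under $H_0$,
$\beta_1 = \cdots = \beta_m = 0$ and $\hat\beta_1^{res} = \cdots = \hat\beta_m^{res} = 0$.
Similar to the proof of Theorem 1 (2), we have that

(5.11)
$$\begin{aligned}
\ell(\hat\beta^{res}) - \ell(\beta^{res}) &= (d^2 - E(d^2))^T(\hat\beta_2^{res} - \beta_2^{res}) - \tfrac{1}{2}(\hat\beta_2^{res} - \beta_2^{res})^T V_{22}(\hat\beta_2^{res} - \beta_2^{res}) + \tilde z \\
&= \tfrac{1}{2}(d^2 - E(d^2))^T V_{22}^{-1}(d^2 - E(d^2)) - \tfrac{1}{2}\tilde h^T V_{22}^{-1}\tilde h + \tilde z,
\end{aligned}$$

where $\hat\beta_2^{res} = (\hat\beta_{m+1}^{res}, \ldots, \hat\beta_t^{res})$,
$\beta_2^{res} = (\beta_{m+1}, \ldots, \beta_t)$, $\tilde h = (\tilde h_{m+1}, \ldots, \tilde h_t)^T$,
and $\tilde h$ and $\tilde z$ are defined analogously to $h$ and $z$ by setting
the first $m$ elements of $\beta$ to be 0.

Note that $m/t \ge \tau > 0$ and $\tau$ is a constant. Similar to the proof of
Theorem 1 (2), we also have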

In^Vo^hl , , , \z\ 

, f = Op(l) and = op(l). 

Therefore, 

£(/3 res ) - ^(/3 res ) _ i(d 2 - E(d 2 )) T V^(d 2 - £(d 2 )) 



Op(l). 



V2(m - 1) ^(m-l) 
Similar to the proof of (6.20), we have 

(d 2 - j E(d 2 ))^ 2 - 2 1 (d 2 -^(d 2 )) 
m 

Consequently, we have 

£0 res ) - l(f3 res ) _ |(d 2 - ff(d 2 )f5 22 (d 2 - £(d 2 )) 
V2(m - 1) V2(m - 1) 

Note that 

(d-E(d)S t ^(d-E(df = j2 [ - dl ~ m))2 



+ Op(l). 



+ Op(l). 



1=1 



It 



(d 2 -^(d 2 )) T 5 22 (d 2 -£;(d 2 )) = y (di-m)? + (ET=i(di-m))f 



Moreover, it is easy to show (X)i=i(<^ ~~ E(di))) 2 /vqq = o p (l) and (<ii 
£(di)) 2 /t)n = o p (l). Since 
we have

2(£0) - £0 res )) - (m - 1) _ E£ 2 (rfj - Ejd^f/vu - (m - 1) 

V2(m-1) ^2(m - 1) ° M 

Similar to the proof of Lemma 1 (4), the main term on the right-hand side
of the above equation is asymptotically normal. This completes the proof. □

6. Proof of Lemmas. For convenience, we denote

(6.1)  $$x_{ik} = d_{ik} - E(d_{ik}), \quad k \neq i, \quad \text{and} \quad x_{ii} = 0,$$

and assume that

$$mN \le v_{ik} \le MN.$$

The proofs of Lemma 1 (1) and (2) are similar, so we only give the proof
of (1).

Proof of Lemma 1 (1). Let $z_i = [(a_i - E(a_i))^2 - E(a_i - E(a_i))^2]/v_{ii}$.
A direct calculation gives that

i-l i-l 

vlE(zf) = J2i Ex tk-(E(xl)) 2 ] + 2 ™ 

k=l l,k=l;tyk 

i-l i-l 

(6.2) = Vik((p ik - p k i) 2 + PikPki(l ~ PikPki)) +2 VikVil- 

k=l k,l=l;l^k 

Note that $\{z_i\}_{i=1}^{t}$ is a sequence of independent random variables. In
order to prove (1), we only need to check $E(z_i^2) < \infty$ and the
Lindeberg-Feller condition

(6.3)  $$\frac{1}{G_t^2}\sum_{i=1}^{t} E[z_i^2 I(|z_i| > \varepsilon G_t)] \to 0,$$

where $G_t^2 = \sum_{i=1}^{t} E(z_i^2)$.
By (6.2), we have that 

(6.4) E{z 2 ) < — + 2, i = l, 

Vii 



and 



^ 2 YX,l=l ;lj=k VikVil 
i=l i=l 



g 2 = J2 e ^) > E 



v 2 - 

1 1 



(t- 1) 2 N 2 M 2 
2(t - 2)m 2 



(65) 2 UP 
Let $\mu > 1$ and $\nu > 1$ be two constants such that $1/\mu + 1/\nu = 1$. Note
that $a_i - E(a_i)$ is the sum of a sequence of independent random variables
$x_{ij}$, $j = 1, \ldots, i-1$, with mean zero, and $x_{ij}$ can be viewed as a
sum of $N$ independent binary random variables $x_{ijl}$, $l = 1, \ldots, N$,
taking the values $-p_{ij}$ and $1 - p_{ij}$ with respective probabilities
$1 - p_{ij}$ and $p_{ij}$. By Rosenthal's inequality, we thus have that

N N 
{=1 z=i 

and 

Eifli-Eoi)* < c A ^(E{x 2 lk ))^ + Y^EY^] 

k=l k=0 
i-1 i-1 

fc=i fc=i 

where is a constant depending only on Afj,. Consequently, 

£K-£q t )^ 2 (M/m) 2 ^ (M/m) 2 ^ 

1 j ^ " 4/i + 4 ^ [ (i - 1)2/^-1 + (* - 1)2^-1 J - 

For any given $\varepsilon > 0$, by (6.5), we have $\varepsilon G_t > 1$ for
large enough $t$. Therefore, by Hoeffding's inequality, we have

Pr(\ Zi \ > eG t ) = Pr((a t - Eatf > v vl (eG t + 1)) 
< 2exp(-2t; ii (eGt + l)/(t-l)) 

(6.8) < 2exp(-2eGt/M). 

Hölder's inequality gives

(6.9)  $$E[z_i^2 I(|z_i| > \varepsilon G_t)] \le (E z_i^{2\mu})^{1/\mu}(\Pr(|z_i| > \varepsilon G_t))^{1/\nu}.$$

Note that

(6.10) WW* < max{(^ (a -~f;- )4 " )^, i } . 

ii- 
ii 
Combining (6.5), (6.7), (6.9) and (6.10) yields
± 1 Y.E[zp(\z l \>eG t )] 

t i=l 

3tM 2 ,-2e(t-l) 1 l 2 m, 

< — ; x — n X exp( ) 

~ 2(t-l)m 2 1 vM ' 

x maxlc^ + {M/m? + Wg/gg^ U 

x max-(c 4jU + c 4A1 { _ ^ 2 _ 1/fl + ^ _ )), L i- 

Since $\mu > 1$ and $\nu > 1$ are constants, if $M/m = o(t^{1/6})$, then the above
expression goes to zero as $t$ goes to infinity. This completes the proof. □

Proof of Lemma 1 (3). We prove that $\sum_{i=1}^{t}(a_i - E(a_i))(b_i - E(b_i))/v_{ii}$
is asymptotically normal by constructing a martingale.

Let $T_i = (a_i - E(a_i))(b_i - E(b_i))/v_{ii} + T_{i-1}$, $i = 1, \ldots, t$, and
$\mathcal{F}_i = \sigma(T_j, j = 1, \ldots, i)$, where $T_0$ is defined as zero.
Since $b_i - E(b_i)$ is independent of $\mathcal{F}_{i-1}$ and $a_i - E(a_i)$,
$E[b_i - E(b_i) \mid \sigma(a_i - E(a_i), \mathcal{F}_{i-1})] = 0$. Consequently,
we have

$$\begin{aligned}
E(T_i \mid \mathcal{F}_{i-1}) &= T_{i-1} + \frac{1}{v_{ii}} E[(a_i - E(a_i))(b_i - E(b_i)) \mid \mathcal{F}_{i-1}] \\
&= T_{i-1} + \frac{1}{v_{ii}} E\bigl[(a_i - E(a_i))\,E[b_i - E(b_i) \mid \sigma(a_i - E(a_i), \mathcal{F}_{i-1})] \bigm| \mathcal{F}_{i-1}\bigr] \\
&= T_{i-1}.
\end{aligned}$$

So the sequence $\{T_i\}_{i=1}^{t}$ forms a martingale. Thus, in order to prove
(3), we only need to check the conditions of the martingale central limit
theorem (Brown, 1971):

(6 n) EjE(Pi ~ EbjfEjjai - Ea^F^/vl _^ 

Ht 



and 

(6.12) ^E^ 2 ^ l>£St} l^i))^0, V s>0. 



i=l 



where $H_t = \sum_{i=1}^{t} E(b_i - E(b_i))^2 E(a_i - E(a_i))^2 / v_{ii}^2 = \sum_{i=1}^{t}\sum_{j=1}^{i-1}\sum_{k=i+1}^{t} v_{ij}v_{ik}/v_{ii}^2$
and $\mathcal{K}_i = (b_i - E(b_i))(a_i - E(a_i))/v_{ii}$.
Note that, 

^ o ^ ,(t-i)m 2 (t + l)(t-l)m 2 (t-l)m 2 

*- ^ t 2 M 2 " 3tM 2 " 3M 2 ' 
i=i 
Let $K_i$ be $E(b_i - E(b_i))^2 E[(a_i - E(a_i))^2 - E(a_i - E(a_i))^2 \mid \mathcal{F}_{i-1}]/v_{ii}^2$.
In order to prove (6.11), we need only to check

Note, if M/m = o(t 1 / 6 ), by (6.4) then 
1* 2 1 « Efa - Ebjf , 2 (gj - £aQ 2 - gfo - i^Qj) 2 a 

- w§ ( — ^ — )m( ~, m 

1 * _ _ 9M 2 A , 1 



= 0(<^>W). 

Next we will prove (6.12). Note that E(IC 2 I^ i>£St j\Ti-i)) , i = 1, • • • ,t are 
nonnegative. Thus, we need only to prove that 



1 « E( ai - Eatfl^^Ejbi - Eb t ) 2 
Hoeffding's inequality gives that

Pr(\Ki\ > eH t ) = Pr(\(oi - Ea^b, - Ek)\ > EV»H t ) 

< Pr(\(ai - Eai)\ > y/evuHt) + PtQQh - Eh)\ > y/evuHt) 

< 2 exp(-2vueH t /i) + 2 exp(-2vueH t /(t - »)) 
(6.13) < 4exp(-2sH t m/M) 

Let $\eta_i = (a_i - E(a_i))^2/v_{ii}$, and let $\mu > 1$ and $\nu > 1$ be two
constants such that $1/\mu + 1/\nu = 1$. By Hölder's inequality,

(6.14) E[ril{\t i \>es t )] < (Erg> l ) 1 '»(E[I(\K i \> e*t)]) 1/v . 
Similar to (6.6), we have that 

i-l i-l 

E(a t -E( ai )) 2 » < c 2M (^%r + c| M ^(5f, + %), 

k=l k=l 

so that 

<«•> ^^^-H+^ + S^)- 
Combining (6.13), (6.14) and (6.15) yields



1 * E{ ai - £(q,)) 2 I(|/Q| > es t )E(b t - E(bj))< 

< t:^[ E(ai ~J ai)2a ] 1/a (Prm\ > es t ))^ 
1=1 a t v ll 

< ^jUct + x AeM-2eH t m/(uM)) 
= O (exp(-2etm 3 /(uM 3 ) + log(M/m 

If M/m = o(t 1 / 6 ), the above expression does go to zero as t goes to infinity. 
This completes the proof. □ 

Proof of (4). Note that

$$(d_i - E(d_i))^2 = (a_i - E(a_i))^2 + (b_i - E(b_i))^2 + 2(a_i - E(a_i))(b_i - E(b_i)).$$

By Lemma 1 (1) to (3) and Slutsky's theorem, $\sum_{i=1}^{t}(d_i - E(d_i))^2/v_{ii}$ is
asymptotically normally distributed. Since
$E\bigl(\sum_{i=1}^{t}(d_i - E(d_i))^2/v_{ii}\bigr) = t$, we only need to check

(6.16) h NEM)W, 

in order to prove Lemma 1 (4). 

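The variance of $\sum_{i=1}^{t}(d_i - E(d_i))^2/v_{ii}$ is the sum of the following
two terms:

(a) $\sum_{i=1}^{t} \mathrm{Var}[(d_i - E(d_i))^2/v_{ii}]$;

(b) $2\sum_{1 \le i < j \le t} \mathrm{Cov}\bigl((d_i - E(d_i))^2/v_{ii},\ (d_j - E(d_j))^2/v_{jj}\bigr)$.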

A direct calculation gives that 

t n,- — 1 

(6.17) Var(di-Edi) 2 = Y][v ik {jp 2 ik -p ik p ki +pl^ + — v 2 k -3d 2 k ] + 2v 2 i . 

Consequently, 

Var(di-Edi) 2 n 

Vik + 3v 2 k M 3M 2 



ipt Var(di-EdiY _ ^ 

I Z " i=1 v ii I < v V — - OUik < — + 



i=l k=l n 




Thus, if M/m = o{t 1 / 2 ), 

£5-1 Varidi - Ediflva 
(6.18) ^*=± — — — o(l) + 2. 

Since \Cov((di — Edi) 2 , (dj - Edj) 2 )\ = \Cov(x 2 - 1 x 2 i )\ < 2dij + v 2 -, we have 

that, 

(6.19) 

1 \Sr\r^ A d i- Ed if ( d J ~ Ed i) 2 m ^ 1 V- V- 2 ^ + *1 < ( 8iV + 



By (6.18) and (6.19), if $M/m = o(t^{1/2})$, we obtain (6.16). This completes
the proof of Lemma 1 (4). □

Proof of Lemma 2. Since E(d- Ed) T W t {d- Ed) = tr(W t V t ) = 1, we 
only need to check that 



Var(Eh=i(di - Eidi^w^dj - E(dj))) 



oil) 



2t 

in order to prove 

r«9m (d - E(d)) T w t (d - E(d)) 

(6.20) = op(l). 

There are four cases for calculating the covariance
$g_{ij\zeta\eta} = \mathrm{Cov}\bigl((d_i - E(d_i))w_{ij}(d_j - E(d_j)),\ (d_\zeta - E(d_\zeta))w_{\zeta\eta}(d_\eta - E(d_\eta))\bigr)$.
Case 1: $i = j = \zeta = \eta$. By (6.17),

t 

\guu\ < w 2 i (2v 2 i + Y / ^ + ^/N)vl+2v ik ) 

k=0 

< wl^f/Z + Nh/S + Nt/A); 
Similarly, we have that 

Case 2: only three of the four indices are the same (assume
that $j = \zeta = \eta$)

hjjjl < \w ljWn \(N 2 t/8 + N 2 /4 + N/2y, 

Case 3: only two of the four indices are the same (assume
that $i = j$ or $j = \zeta$)

\guri(\ = \wuwc v (2vi£v iri + vuV£ v )\ < Iwuw^K^t/lG + N 2 /8); 
\gijj v \ = {wuWj^VjiVjr, + VijV jv )\ < 3\wiiW jv \N 2 /l6; 
Case 4: all four indices are different:

\[ |g_{ij\zeta\eta}| = |w_{ij} w_{\zeta\eta} (v_{i\zeta} v_{j\eta} + v_{i\eta} v_{j\zeta})| \le 2 |w_{ij} w_{\zeta\eta}| N^2/16. \]
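
The Case 4 expression has the same form as the Gaussian fourth-moment (Isserlis) formula. A short sketch of why no extra term appears, under an assumption made for this illustration: each $a_i$ is a sum of edge variables $a_{ik}$ that are independent across distinct unordered pairs $\{i, k\}$. Write $x_i = a_i - E a_i$ and let $\kappa_4$ denote the joint fourth cumulant.

```latex
\begin{align*}
\mathrm{Cov}(x_i x_j,\; x_\zeta x_\eta)
  &= \kappa_4(x_i, x_j, x_\zeta, x_\eta)
   + \mathrm{Cov}(x_i, x_\zeta)\,\mathrm{Cov}(x_j, x_\eta)
   + \mathrm{Cov}(x_i, x_\eta)\,\mathrm{Cov}(x_j, x_\zeta), \\
\kappa_4(x_i, x_j, x_\zeta, x_\eta)
  &= \sum_{k, l, m, n} \kappa_4(a_{ik}, a_{jl}, a_{\zeta m}, a_{\eta n}) = 0
  \quad \text{when } i, j, \zeta, \eta \text{ are distinct.}
\end{align*}
```

Each summand vanishes because a joint cumulant is zero unless all four edge variables belong to the same pair, and a single pair cannot contain four distinct vertices. If $|\mathrm{Cov}(x_i, x_\zeta)| = v_{i\zeta}$ for $i \neq \zeta$ with a common sign, the two remaining products are $v_{i\zeta} v_{j\eta}$ and $v_{i\eta} v_{j\zeta}$.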



Consequently, if $M_t = o(t^{1/6})$, then

Var{d - Ed) T W t (d - Ed) ANM?(NM t + 1) 2 t(N 2 t 2 /8 + N 2 1/8 + Nt/4) 



Note that

\[ (d - E(d))^T V_t^{-1} (d - E(d)) = (d - E(d))^T S_t (d - E(d)) + (d - E(d))^T W_t (d - E(d)). \]

This completes the proof. □



Acknowledgements. This project is partially supported by grants from the
National Science Foundation of China and the National University of Singapore.

References. 

Blitzstein, J. and Diaconis, P. (2009). A sequential importance sampling algorithm for generating random graphs with prescribed degrees.
Bollobás, B. (2001). Random Graphs, 2nd ed. Cambridge University Press. ISBN 0521797225.
Bradley, R. A. and Terry, M. E. (1952). The rank analysis of incomplete block designs I. The method of paired comparisons. Biometrika 39 324-345.
Brown, B. M. (1971). Martingale central limit theorems. Ann. Math. Statist. 42 59-66.
Chatterjee, S., Diaconis, P. and Sly, A. (2011). Random graphs with a given degree sequence. Ann. Appl. Probab. 21 1400-1435.
David, H. A. (1988). The Method of Paired Comparisons. Oxford University Press.
Erdős, P. and Rényi, A. (1959). On random graphs. I. Publicationes Mathematicae 6 290-297.
Fan, J., Zhang, C. and Zhang, J. (2001). Generalized likelihood ratio statistics and Wilks phenomenon. Ann. Statist. 29 153-193.
Ford, L. R. Jr. (1957). Solution of a ranking problem from binary comparisons. The American Mathematical Monthly 64(8), part 2, 28-33.
Holland, P. W. and Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. J. Amer. Statist. Assoc. 76 33-65.
Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58 13-30.
Jackson, M. O. (2008). Social and economic networks. Princeton University Press, Princeton, NJ.






Loève, M. (1977). Probability Theory I, 4th ed. Springer-Verlag, New York.
Neyman, J. and Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica 16 1-32.
Newman, M. E. J., Strogatz, S. H. and Watts, D. J. (2001). Random graphs with arbitrary degree distributions and their applications. Physical Review E, 026118.
Park, J. and Newman, M. E. (2004). Statistical mechanics of networks. Physical Review E, 066117.
Portnoy, S. (1988). Asymptotic behavior of likelihood methods for exponential families when the number of parameters tends to infinity. Ann. Statist. 16 356-366.
Rinaldo, A., Petrovic, S. and Fienberg, S. E. (2011). Maximum likelihood estimation in network models. Available at http://arxiv.org/abs/1105.6145
Robins, G. L., Pattison, P. E., Kalish, Y. and Lusher, D. (2007). An introduction to exponential random graph (p*) models for social networks. Social Networks 29 173-191.
Simons, G. and Yao, Y.-C. (1999). Asymptotics when the number of parameters tends to infinity in the Bradley-Terry model for paired comparisons. Ann. Statist. 27 1041-1060.
Yan, T., Yang, Y. and Xu, J. (2011). Sparse paired comparisons in the Bradley-Terry model. Statistica Sinica. Accepted.
Yan, T., Xu, J. and Yang, Y. (2011). Grouped sparse paired comparisons in the Bradley-Terry model. Manuscript.
Yan, T. and Xu, J. (2012). A central limit theorem in the β-model for undirected random graphs with a diverging number of vertices. Submitted.
Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Statist. 9 60-62.
Zermelo, E. (1929). Die Berechnung der Turnier-Ergebnisse als ein Maximumproblem der Wahrscheinlichkeitsrechnung. Math. Zeit. 29 436-460.

Department of Statistics and Finance
University of Science and Technology of China
Anhui, 230026, China
E-mail: sunroom@mail.ustc.edu.cn

Department of Statistics and Applied Probability
National University of Singapore
6 Science Drive 2, Singapore 117546, Singapore
E-mail: staxj@nus.edu.sg

Department of Statistics and Finance
University of Science and Technology of China
Anhui, 230026, China
E-mail: ynyang@ustc.edu.cn



